Recent work has revealed a new class of "zero-determinant" (ZD) strategies for iterated,
two-player games. ZD strategies allow a player to unilaterally enforce a linear relationship
between her score and her opponent's score, and thus achieve an unusual degree of control
over both players' long-term payoffs. Although originally conceived in the context of classical,
two-player game theory, ZD strategies also have consequences in an evolving population
of players. Here we explore the evolutionary prospects for ZD strategies in the Iterated
Prisoner's Dilemma. Several recent studies have focused on the evolution of "extortion
strategies", a subset of zero-determinant strategies, and found them to be unsuccessful in
populations. Nevertheless, we identify a different subset of ZD strategies, called "generous
ZD strategies," that tend to dominate in evolving populations. For all but the smallest
population sizes, generous ZD strategies are not only evolutionarily stable, but they also
can invade and replace any other ZD strategy. Furthermore, generous ZD strategies are
robust to invasion by any non-ZD strategy. When evolution occurs on the full set of all
possible IPD strategies, selection disproportionately favors generous ZD strategies. In some
regimes, generous ZD strategies outperform even the most successful of the well-known IPD
strategies, including Win-Stay-Lose-Shift.

Introduction



Press and Dyson [1] recently revealed a remarkable class of strategies, called "zero-determinant" (ZD)
strategies, for iterated two-player games. ZD strategies are of particular interest in the Iterated
Prisoner's Dilemma (IPD), the canonical game used to study the emergence of cooperation among rational
individuals [2-5]. By allowing a player to unilaterally enforce a linear relationship between her score and
her opponent's score, Press and Dyson argue, ZD strategies can allow a sentient player unprecedented
control over the outcome of IPD games. In particular, Press and Dyson highlighted a special subset of ZD
strategies, called "extortion strategies," that grant the extorting player a disproportionately high payoff
when employed against a naive player who blindly adjusts his strategy to maximize his own score.

A natural response to Press and Dyson [10] is to ask: What are the implications of ZD strategies for an 
evolving population of players? Although several recent papers have begun to explore this question [11,12],
they have focused almost exclusively on extortion strategies. Extortion strategies are not successful in 
evolving populations, unless the population size is very small. Like all strategies that prefer to defect 
rather than to cooperate, extortion strategies are vulnerable to strategies that reward cooperation but 
punish defection. However, there is more to ZD strategies than just extortion. Here, we consider the 
full range of ZD strategies and we show that, when it comes to evolutionary success, it is generosity, not 
extortion, that rules. 

We start our analysis by showing that a population restricted to the space of ZD strategies will
always evolve to a special subset of strategies, which we call "generous ZD strategies" ($G_\chi$). Generous ZD
strategies are qualitatively similar to the well-known strategy generous tit-for-tat (GTFT). Like GTFT, $G_\chi$
cooperates with cooperators, and mostly punishes defectors, but it will occasionally forgive and cooperate
with even the most intransigent defector. We show that $G_\chi$ is robust to invasion and replacement by any
other strategy; it can be replaced only neutrally and only by strategies that agree to mutual cooperation.
We explore, both analytically and numerically, how successful $G_\chi$ is at invading other strategies. We find
that, in large populations, generous ZD strategies are as successful as, and sometimes more successful than,
the most successful of the well-known IPD strategies. Finally, we show that a population evolving on the full
set of IPD strategies spends a disproportionate amount of time near $G_\chi$ strategies, indicating that they
are favored by evolution.

Methods and Results 

In the Prisoner's Dilemma (PD), players X and Y must each simultaneously choose whether to cooperate 
(c) or defect (d). If both players cooperate (cc) then they each receive payoff R. If X cooperates and Y
defects (cd), then X loses out and receives the smallest possible payoff, S, while Y receives the largest
possible payoff, T. If both players defect (dd), then both players receive payoff P. Payoffs are specified so
that the reward for mutual defection is less than the reward for mutual cooperation, i.e. T > R > P > S. 
In addition, it is typically assumed that 2R > T + S, so that it is not possible for both players to receive 
payoffs exceeding R. 

The Iterated Prisoner's Dilemma (IPD) consists of many successive rounds of the PD game. In order
to understand a player's long-term score, Press and Dyson [1] showed that it is sufficient to consider
only the set of memory-1 strategies, i.e. strategies that specify the probability a player cooperates each
round in terms of the payoff she received in the previous round. Memory-1 strategies thus consist of four
probabilities, $p = \{p_R, p_S, p_T, p_P\}$. More specifically, Press and Dyson [1] showed that the payoff to a
memory-1 player pitted against an opponent with an arbitrary memory is the same as her payoff would
be against some other, memory-1 opponent.
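
As an illustration of how long-term payoffs follow from two memory-1 strategies, the sketch below builds the four-state Markov chain over the outcomes (cc, cd, dc, dd) and averages the payoffs over its stationary distribution. The function name `stationary_payoffs`, the least-squares solve, and the example strategies are illustrative assumptions, not taken from the paper; the sketch also assumes the chain has a unique stationary distribution.

```python
import numpy as np

def stationary_payoffs(p, q, payoffs):
    """Return long-term payoffs (s_xy, s_yx) for memory-1 players X and Y.

    p, q    : cooperation probabilities (p_R, p_S, p_T, p_P), each indexed by
              the payoff that player received in the previous round.
    payoffs : (R, S, T, P).
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # States are ordered (cc, cd, dc, dd) from X's point of view. Y sees cd
    # and dc swapped, so reorder Y's probabilities to X's state ordering.
    q = q[[0, 2, 1, 3]]
    # Row i of M holds the probabilities of moving from state i to each state.
    M = np.column_stack([p * q, p * (1 - q), (1 - p) * q, (1 - p) * (1 - q)])
    # Stationary distribution v solves v M = v with sum(v) = 1.
    A = np.vstack([M.T - np.eye(4), np.ones((1, 4))])
    b = np.array([0.0, 0.0, 0.0, 0.0, 1.0])
    v = np.linalg.lstsq(A, b, rcond=None)[0]
    R, S, T, P = payoffs
    return float(v @ [R, S, T, P]), float(v @ [R, T, S, P])

# Donation game with B = 3, C = 1: (R, S, T, P) = (2, -1, 3, 0).
donation = (2.0, -1.0, 3.0, 0.0)
s_xy, s_yx = stationary_payoffs([1, 1, 1, 1], [0, 0, 0, 0], donation)  # ALLC vs ALLD
```

When one of the strategies is deterministic and the chain has several absorbing outcomes, the stationary distribution is not unique and this solve is not meaningful; a small error rate would be needed in that case.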

In the context of evolutionary game theory, we consider a population of N individuals, each characterized
by a strategy p. The evolutionary success of a strategy depends on its payoff when pitted against
all players in the population [13-15]. As per Press & Dyson [1], it is sufficient to consider only the set
of memory-1 strategies to determine the long-term payoff to a memory-1 player. However, there is an
additional complication in a population with N > 2, because the evolutionary success of a memory-1
player also depends on the success of her opponents (who may, in general, have long memories) playing
against each other. Nonetheless, we will show that no strategy, no matter how long its memory, can do
better than generous ZD strategies in large populations.

Evolution within the set of ZD strategies 

Among the space of all memory-1 IPD strategies, Press and Dyson identified a subspace of so-called
"zero-determinant" strategies that ensure a fixed, linear relationship between the long-term payoffs of the
two players. In this section we analyze how a population evolves when restricted to the space of ZD
strategies.

We denote the long-term payoff of IPD player X by $s_{xy}$, and that of player Y by $s_{yx}$. If player Y employs
a ZD strategy, then the payoffs satisfy the linear relationship

\[ s_{yx} - \kappa = \chi \, (s_{xy} - \kappa). \tag{1} \]

In these terms, the extortion strategies of Press and Dyson, denoted $E_\chi$, are defined as the subset of ZD
strategies with $\chi \ge 1$ and $\kappa = P$.

When one ZD strategy $\{\kappa_x, \chi_x\}$ is pitted against another ZD strategy $\{\kappa_y, \chi_y\}$ we can easily derive
the resulting payoffs (Table 1, and Appendix).

Table 1: Payoff matrix for two ZD players (assuming $\chi > 1$). Each entry is the long-term payoff to the row
player against the column player.

\[
\begin{array}{c|cc}
  & X & Y \\ \hline
X & \kappa_x & \dfrac{\kappa_x(\chi_x - 1) + \chi_x \kappa_y(\chi_y - 1)}{\chi_x \chi_y - 1} \\
Y & \dfrac{\kappa_y(\chi_y - 1) + \chi_y \kappa_x(\chi_x - 1)}{\chi_x \chi_y - 1} & \kappa_y
\end{array}
\]

Given these payoffs, it is simple to show that a resident
ZD strategy Y with $\chi_y > 1$ in a population can be invaded by a mutant ZD strategy X if and only if
$\kappa_x > \kappa_y$ (as we show in the Appendix, this holds for all $\chi_x$). As a result, we see immediately that extortion
strategies are not evolutionarily stable, because they can be invaded by any ZD strategy with $\kappa > P$.
Amongst ZD strategies with $\chi > 1$, it is those strategies with a larger value of $\kappa$ that are evolutionarily
successful. The maximum possible value of $\kappa$ is R and, indeed, the set of ZD strategies with $\kappa = R$ and
$\chi > 1$ are evolutionarily stable against all ZD strategies with $\kappa < R$ (see Appendix). This special subset
of ZD strategies can be written down explicitly:
where we have formulated the IPD as the donation game [12] with payoffs $T = B$, $R = B - C$, $P = 0$ and
$S = -C$. The parameter $\phi$ (which does not affect the payoff received by either player) lies in the range
from zero up to a maximum that depends on $\chi$. Eq. 2 defines the full range of ZD strategies with $\kappa = R$ and $\chi > 1$, which we call
"generous ZD strategies" and denote $G_\chi$. $G_\chi$ strategies are ESS within the space of ZD strategies with
$\chi > 1$; and, conversely, any strategy with $\kappa < R$ is vulnerable to invasion by a member of $G_\chi$ (Fig. S1
and Appendix). $G_\chi$ strategies are generous because they reward cooperation by cooperating, and they
punish defection. However, unlike strategies such as tit-for-tat (TFT), generous ZD strategies will also
occasionally cooperate with a player who has defected. The well-known strategy GTFT, which is defined
by $p = \{1, 1 - C/B, 1, 1 - C/B\}$, corresponds to a $G_\chi$ strategy with $\chi \to \infty$.
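
To make the definition concrete, the following sketch constructs a generous ZD strategy for the donation game and checks it numerically. It assumes the Press-Dyson parametrization $\tilde{p} = \phi[(S_{own} - R) - \chi(S_{opp} - R)]$, which is not taken from the paper and may normalize $\phi$ differently from its own Eq. 2; the function names and the opponent strategy are likewise hypothetical. Under that assumption the strategy should satisfy Eq. 1 with $\kappa = B - C$ against any opponent, and should approach GTFT as $\chi$ grows.

```python
import numpy as np

def generous_zd(chi, phi, B, C):
    """(p_R, p_S, p_T, p_P) for a generous ZD strategy (kappa = R = B - C).

    Assumed form: p = (1, 1, 0, 0) + phi * [(S_own - R) - chi * (S_opp - R)],
    with donation-game payoffs R = B - C, S = -C, T = B, P = 0.
    """
    return np.array([1.0,
                     1.0 - phi * (B + chi * C),
                     phi * (C + chi * B),
                     phi * (chi - 1.0) * (B - C)])

def long_term_payoffs(p, q, payoffs):
    """Stationary payoffs (to p's player, to q's player) of two memory-1 strategies."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)[[0, 2, 1, 3]]  # opponent sees cd and dc swapped
    M = np.column_stack([p * q, p * (1 - q), (1 - p) * q, (1 - p) * (1 - q)])
    A = np.vstack([M.T - np.eye(4), np.ones((1, 4))])
    v = np.linalg.lstsq(A, np.array([0.0, 0.0, 0.0, 0.0, 1.0]), rcond=None)[0]
    R, S, T, P = payoffs
    return float(v @ [R, S, T, P]), float(v @ [R, T, S, P])

B, C, chi = 3.0, 1.0, 1.5
phi = 0.5 / (chi * B + C)            # any value in (0, 1 / (chi * B + C)] keeps p in [0, 1]
g = generous_zd(chi, phi, B, C)
opponent = [0.3, 0.6, 0.2, 0.8]      # arbitrary memory-1 opponent (hypothetical)
s_g, s_opp = long_term_payoffs(g, opponent, (B - C, -C, B, 0.0))
# Eq. 1 with kappa = B - C: the generous player's shortfall is chi times the opponent's.
residual = (s_g - (B - C)) - chi * (s_opp - (B - C))
```

With these numbers `residual` vanishes to numerical precision, and setting `phi = 1 / (chi * B + C)` with very large `chi` recovers the GTFT probabilities quoted above.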

To illustrate the fact that $G_\chi$ forms an evolutionarily stable subset of ZD strategies with $\chi \ge 1$, we
performed simulations of a well-mixed, finite population of IPD players. Following [12,17] we modeled
selection as a process in which individuals copy successful strategies with a probability dependent on their
payoffs (see Appendix). As Fig. 1 shows, evolution within the set of ZD strategies proceeds from extortion
($\kappa = P$) to generosity ($\kappa = R$).

The evolutionary stability of $G_\chi$ strategies

We have shown that $G_\chi$ strategies are evolutionarily successful and eventually dominate in a population,
when players are confined to the space of ZD strategies. But how does $G_\chi$ perform against the space of
all IPD strategies? To answer this question in general, we have derived the conditions under which an
arbitrary memory-1 strategy can invade and eventually replace $G_\chi$, in both infinite and finite populations
(see Appendix).

Figure 1 - Evolution from extortion to generosity within the space of ZD strategies. Populations were
simulated in the regime of weak mutation. The figure shows the ensemble mean value of $\kappa$ in the
population, plotted over time. $\kappa = 0$ corresponds to the extortion strategies of Press & Dyson ($E_\chi$), whereas
$\kappa = B - C$ corresponds to the generous ZD strategies ($G_\chi$). Each population was initialised at an
extortion strategy $E_\chi$, with $\chi$ drawn uniformly from the range $\chi \in [1, 10]$. Given a resident strategy in the
population, mutations to $\{\kappa, \chi, \phi\}$ were proposed as normal deviates of the resident strategy, truncated
to constrain $\kappa \in [P, R]$, $\chi \ge 1$, and $\phi$ within the feasible range given $\kappa$ and $\chi$. A proposed mutant
strategy replaces the resident strategy with a fixation probability calculated using their respective payoffs, as
in [12,17]. The mean $\kappa$ among 10^ replicate populations is plotted as a function of time. (The value of $\chi$
was observed to evolve neutrally, Fig. S2.) Parameters are $B = 3$, $C = 1$, $N = 100$, and selection strength
$\sigma = 1$.

In an infinite population, $G_\chi$ is evolutionarily stable within a very broad class of memory-1 IPD
strategies. To see this, suppose that a resident $G_\chi$ strategy Y faces an arbitrary mutant strategy X. X
receives some payoff $s_{xx}$ when playing against itself, and some payoff $s_{xy}$ when playing against Y. Because
$G_\chi$ enforces a linear relationship between the players' scores (Eq. 1), we can easily calculate the payoff
matrix between X and Y in terms of these two payoffs (Table 2). Table 2 shows that $G_\chi$ resists invasion
(that is, the invader X receives lower payoff against the resident Y than Y does against itself) provided
$s_{xy} < B - C$. In other words, $G_\chi$ is evolutionarily stable against all memory-1 strategies that receive a
payoff less than $B - C$ when pitted against $G_\chi$. On the other hand, it is not possible for X to receive a
payoff larger than $B - C$, because if $s_{xy} > B - C$ then Eq. 1 (with $\kappa = B - C$) implies that $s_{yx} > B - C$
as well, which contradicts the usual IPD stipulation that $2R > T + S$. As a result, the best that X can
do against Y is to agree to mutual cooperation (i.e. $p_R = 1$), so that both players receive long-term payoff
$B - C$. In summary, $G_\chi$ is evolutionarily stable against any memory-1 strategy that does not agree to
mutual cooperation, and $G_\chi$ is neutral against any strategy that is mutually cooperative; in other words,
$G_\chi$ is Nash in an infinite population (see also [18]).

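Table 2: Payoff matrix for a $G_\chi$ player Y pitted against an arbitrary opponent X. Each entry is the
long-term payoff to the row player against the column player; the entry for Y against X follows from Eq. 1
with $\kappa = B - C$.

\[
\begin{array}{c|cc}
  & X & Y \\ \hline
X & s_{xx} & s_{xy} \\
Y & \chi\,(s_{xy} - (B - C)) + (B - C) & B - C
\end{array}
\]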
In a finite population, the ESS condition is not sufficient to fully understand the evolutionary dynamics
of the IPD [19,20]. Players that do well against themselves, but poorly against a resident strategy, can
still weakly dominate and replace the resident strategy, as occurs when Always Defect (ALLD) is resident
and TFT invades [20]. General conditions for evolutionary stability in a finite population can be found
under weak selection [20]. Applying these conditions (see Appendix), we find that $G_\chi$ resists initial
invasion by another strategy provided $s_{xy} < B - C$ (the ESS condition) and $\chi < N - 1$. In other words,
$G_\chi$ cannot be invaded in a finite population provided $\chi$ is not too large. Furthermore, we find that $G_\chi$
resists replacement by another strategy provided $s_{xy} < B - C$ (the ESS condition) and $\chi < (2N - 1)/(N - 1)$.
In summary, $G_\chi$ cannot be invaded or replaced by any non-cooperative strategy, except in very small
populations, provided $\chi < 2$. These results hold even against long-memory strategies (see Appendix).

Given these conditions for evolutionary stability in a finite population, we focus in what follows on
those $G_\chi$ strategies with $1 < \chi < 2$, that is, the subset of ZD strategies that cannot be invaded and
replaced by non-cooperative strategies, even in a finite population. Note that the constraint $\chi \le 2$
excludes GTFT from the set of evolutionarily successful $G_\chi$ strategies, since GTFT has $\chi \to \infty$.

Aside from evolutionary stability we can also ask the converse question: how adept is a new mutant
$G_\chi$ at invading and replacing an arbitrary resident strategy? Using the payoffs from Table 2 we can find
conditions for $G_\chi$ to invade and replace a resident strategy (see Appendix). Against a resident strategy
that agrees to mutual cooperation (i.e. one with $p_R = 1$), $G_\chi$ is again found to be neutral, so that $G_\chi$
can invade only by genetic drift. However, $G_\chi$ is selected to invade and replace many defector strategies,
including ALLD (see Appendix).

The evolutionary success of $G_\chi$ against classic IPD strategies

To complement the general stability results described above, we compared the performance of generous
ZD strategies against several classic IPD strategies, in a finite population of players [12,17,19,20].

Using the payoffs for the donation game, and assuming that there is some "error rate", $\epsilon$, in the
move played by each strategy at each time step [21], we obtained the payoffs shown in Table 3, in the
limit $\epsilon \to 0$. As Table 3 shows, $G_\chi$ fares very well against most opponents, and so too does the strategy
Win-Stay-Lose-Shift (WSLS). WSLS is well known as an evolutionarily successful IPD strategy [7,22],
and it was previously shown to easily dominate extortion strategies $E_\chi$ [12]. As we have seen above, $G_\chi$
also easily dominates $E_\chi$. $G_\chi$ and WSLS cooperate with each other, and thus are neutral against each
other. WSLS is more successful than $G_\chi$ against ALLC, whereas $G_\chi$ is more successful against defector
strategies such as ALLD. Thus $G_\chi$ and WSLS each have certain advantages.

Table 3: Pairwise comparison of IPD strategies
TFT WSLS $E_\chi$ $G_\chi$ ALLC ALLD
To illustrate the performance of $G_\chi$ compared to classic IPD strategies, we performed stochastic
simulations with different subsets of strategies, similar to those of [12]. In these simulations, a pair of
individuals is chosen at each time step, and the first individual copies the strategy of the second with
a probability that depends on their respective payoffs (see Fig. 2) [17]. Mutations also occur, with
probability $\mu$, so that the mutated individual randomly adopts another strategy from the set of strategies
being considered. We ran simulations at a variety of population sizes, ranging from $N = 2$ to
$N = 1,000$. At very small population sizes, ALLD and $E_\chi$ tend to dominate, reflecting the fact that
extortion pays in the classic two-player setting [1]. However, as the population size increases, $G_\chi$ and
WSLS quickly begin to dominate. In competitions including multiple strategies, WSLS and $G_\chi$ are both
present at high frequency, with a slight advantage to WSLS when ALLC is present, and a slight advantage
to $G_\chi$ when ALLC is absent. Which strategy does best depends on the population size, the mutation
rate, and the set of available strategies (Fig. 2). Nonetheless, there can be little doubt that generous ZD
strategies are remarkably successful in evolving populations.
Figure 2 - Competition between generous ZD strategies and various sets of classic IPD strategies. For a
range of population sizes $N$, we plot the frequency of each strategy under mutation-selection balance. The
left-hand column shows the limit of rare mutations $\mu \to 0$ (determined analytically), and the right-hand
column shows equilibrium frequencies with mutation rate $\mu = 0.05$ (determined by stochastic simulation).
Selection was implemented according to [12,17], as in Fig. 1. ALLD dominates WSLS in head-to-head
competition, whereas $G_\chi$ dominates ALLD in head-to-head competition, except in very small populations.
In competitions including more strategies, WSLS and $G_\chi$ are both present at high frequency, with a
slight advantage to WSLS when ALLC is present, and a slight advantage to $G_\chi$ when ALLC is absent.
Parameters are $B = 3$, $C = 1$, $\chi = 1.5$ and selection strength $\sigma = 1$.



The evolutionary success of $G_\chi$ against all IPD strategies



A more systematic way to query the evolutionary success of $G_\chi$ strategies is to allow a population to explore
the full set of memory-1 strategies $p = \{p_R, p_S, p_T, p_P\}$. Following [12,23], we performed simulations in
the regime of weak mutation, so that the population is monomorphic for a single strategy at all times.
Mutant strategies, drawn uniformly from the space $\{p_R, p_S, p_T, p_P\}$, are proposed at rate $\mu$. A proposed
mutant strategy either immediately fixes or is immediately lost from the population, according to its
fixation probability calculated relative to the current strategy in the population [12,17].
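
The fixation probability used in such weak-mutation simulations can be sketched as follows. This assumes the standard closed form for a pairwise-comparison (imitation) process in which each individual plays everyone except itself; the paper only states that fixation probabilities follow [12,17], so the exact formula, the function name and the argument names are assumptions. The sum is evaluated in log space so that strong selection does not overflow.

```python
import math

def fixation_probability(a, b, c, d, N, sigma):
    """Probability that one mutant A takes over a resident population of B.

    a: payoff of A against A, b: A against B, c: B against A, d: B against B.
    Assumed form: rho = 1 / (1 + sum_k prod_{i<=k} exp(-sigma * (pi_A(i) - pi_B(i)))),
    where i is the number of mutants and self-interaction is excluded.
    """
    log_terms = [0.0]          # log of each term in the sum; the leading 1 is exp(0)
    running = 0.0
    for i in range(1, N):
        pi_mut = ((i - 1) * a + (N - i) * b) / (N - 1)
        pi_res = (i * c + (N - i - 1) * d) / (N - 1)
        running -= sigma * (pi_mut - pi_res)
        log_terms.append(running)
    top = max(log_terms)       # factor out the largest term to avoid overflow
    total = sum(math.exp(t - top) for t in log_terms)
    return math.exp(-top) / total

# A neutral mutant fixes with probability 1/N.
rho_neutral = fixation_probability(1.0, 1.0, 1.0, 1.0, N=100, sigma=1.0)
```

In a weak-mutation simulation, a proposed mutant would replace the resident with this probability and otherwise be discarded, so the population stays monomorphic between fixation events.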

Over the course of this simulation, we quantified how much time the population spends in a
$\delta$-neighborhood of $G_\chi$ strategies. The $\delta$-neighborhood of a strategy set is defined as those strategies within
Euclidean distance $\delta$ of it, in the space of all memory-1 strategies. If the proportion of time spent in
the $\delta$-neighborhood of a strategy is greater than would be expected by random chance (i.e. exceeds the
volume of the $\delta$-neighborhood), then evolution is said to favor the strategy.
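
The comparison against random chance can be sketched as below: the neutral expectation is estimated by Monte Carlo as the fraction of uniformly drawn memory-1 strategies that fall within $\delta$ of a target set, and the observed fraction of visited strategies is divided by it. Representing the target set by a finite sample of points, and the names `neighborhood_fraction` and `relative_abundance`, are illustrative assumptions rather than the procedure used in the paper.

```python
import numpy as np

def neighborhood_fraction(points, targets, delta):
    """Fraction of `points` within Euclidean distance delta of any target point."""
    points = np.asarray(points, dtype=float)
    targets = np.asarray(targets, dtype=float)
    # Pairwise distances between every point and every target, shape (n_points, n_targets).
    dist = np.linalg.norm(points[:, None, :] - targets[None, :, :], axis=2)
    return float(np.mean(dist.min(axis=1) <= delta))

def relative_abundance(visited, targets, delta, n_neutral=20000, seed=0):
    """Observed time in the delta-neighborhood divided by its neutral (volume) expectation."""
    rng = np.random.default_rng(seed)
    uniform = rng.random((n_neutral, 4))       # uniform draws from [0, 1]^4
    neutral = neighborhood_fraction(uniform, targets, delta)
    observed = neighborhood_fraction(visited, targets, delta)
    return observed / neutral                  # > 1 means favored by selection

# Toy check: a population that always sits on the target is strongly over-represented.
target = np.array([[0.5, 0.5, 0.5, 0.5]])
ratio = relative_abundance(np.tile(target, (10, 1)), target, delta=0.3)
```

For very small $\delta$ the Monte Carlo estimate of the neutral volume can be zero, in which case more samples (or an analytical volume) would be needed before taking the ratio.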

Hilbe et al. showed that, except for very small populations, a population spends less time near
extortion strategies than expected by random chance, and that the same is true for the set of all ZD
strategies [12]. Thus, extortion and ZD strategies in general are disfavored by evolution in populations.
This has led to the view that ZD strategies are of importance only in the setting of classical two-player
game theory, and not in evolving populations [23]. In Fig. 3 we repeat this analysis but we additionally
report the time spent in the $\delta$-neighborhood of $G_\chi$ strategies. We find that, except in very small populations,
selection strongly favors $G_\chi$ strategies, with populations spending roughly five times longer in the neighborhood of
$G_\chi$ than expected by random chance.
Figure 3 - Generous ZD strategies are favored by selection in an evolving population. We simulated
a population under weak mutation, proposing mutants drawn uniformly from the full set of memory-1
IPD strategies. We calculated (top) the time spent in the $\delta$-neighborhood [12] of ZD and extortion
strategies, as well as generous ZD strategies, relative to their random (neutral) expectation. We also
calculated (bottom) the average strategy of the population. For small population sizes, extortioners are
abundant and generous ZD strategies are nearly absent. As population size increases, the frequency of $G_\chi$
is strongly amplified by selection, whereas extortion strategies, and ZD strategies in general, are disfavored,
as previously reported [12]. The average strategy of the population is a mix of WSLS and $G_\chi$. Simulations
were run until the population fixed 10^ mutations. Parameters are $B = 3$, $C = 1$ and selection strength
$\sigma = 100$.



$G_\chi$ compared to WSLS

We have seen that both $G_\chi$ and WSLS are remarkably successful in evolving populations, but they
have distinct advantages in different situations. Fig. 4 provides another perspective on the differences
between these two types of strategies, by quantifying how successful they are at invading random
memory-1 strategies, and conversely how successful random strategies are at invading $G_\chi$ or WSLS, in a finite
population.

Fig. 4 confirms the fact, derived above, that neither WSLS nor $G_\chi$ for $1 < \chi \le 2$ can be invaded by
any other IPD strategy with a probability that exceeds the neutral replacement probability, $1/N$. In this
sense, no IPD strategy is selectively favored to replace either WSLS or $G_\chi$ (for $1 < \chi < 2$). Analogous
simulations (Fig. S3) illustrate that when $\chi > 2$ then $G_\chi$ can indeed be invaded by other strategies with
probabilities exceeding $1/N$. And so GTFT does not enjoy the privileged position that $G_{1<\chi<2}$ enjoys.

Fig. 4 also quantifies the ability of $G_\chi$ and WSLS to invade random, resident strategies. $G_\chi$ performs
better than neutral (i.e. displaces the resident with probability exceeding $1/N$) against 96% of resident
strategies, for $\chi$ near 1; WSLS on the other hand performs better than neutral against 83% of resident
strategies. But the two strategies do not necessarily perform well against the same sets of resident
strategies. For example, $G_\chi$ displaces ALLD, and WSLS displaces ALLC, but not vice versa. Moreover,
the distribution of displacement probabilities for WSLS is broader than that of $G_\chi$, meaning that when
WSLS does well as an invader it does very well, and when it does badly it does very badly; whereas $G_\chi$
is more consistent as an invader.
Figure 4 - Both $G_\chi$ and WSLS can invade but cannot easily be invaded by other IPD strategies. We
computed the probability $p_{res}$ that a resident $G_\chi$ or WSLS strategy will be displaced by a random mutant
strategy drawn uniformly from the space of memory-1 strategies (top). Neither strategy is ever invaded
with a probability exceeding the neutral replacement probability, $1/N$. Conversely, we also computed
the probability $p_{inv}$ that a mutant $G_\chi$ or WSLS strategy will invade and fix in a population comprised
of a random memory-1 strategy. WSLS can invade with a probability greater than neutral in ~83%
of cases, while $G_\chi$ can invade in up to ~96% of cases, depending on the choice of $\chi$. We drew 10^
strategies uniformly from the space of all memory-1 strategies, $p = \{p_R, p_S, p_T, p_P\}$, and computed the
fixation probabilities in a population of size $N = 100$. The x-axis shows $\log_{10}[N p]$, so that 0 corresponds
to the neutral fixation probability, negative values correspond to selection against fixation, and positive
values to selection for fixation. Parameters are $B = 3$, $C = 1$, $\chi = 1.01$, and selection strength $\sigma = 1$.



Discussion 

We have shown that generous ZD strategies are important in evolving populations of players. $G_\chi$ is
particularly successful when mutations arise at an appreciable rate, because it efficiently purges other
strategies such as TFT and ALLD. Under such circumstances, $G_\chi$ can dominate even
Win-Stay-Lose-Shift, a perennial favourite in evolutionary populations. Overall, selection strongly favors $G_\chi$ when
evolution proceeds in the full space of memory-1 strategies. These results strongly contravene the view
that ZD strategies are of little evolutionary importance [23]. In fact, we have shown that a subset of ZD
strategies, the generous ones, are strongly favored in the evolutionary setting.

Generous ZD strategies were not explicitly considered by Press & Dyson. However, an example of a
$G_\chi$ strategy was discussed in our commentary on their paper [10], and found to receive a high score in
a round-robin tournament like that performed by Axelrod [3]. The ability to identify $G_\chi$ strategies, of
course, requires an understanding of ZD strategies. As mentioned above, $G_\chi$ strategies are qualitatively
similar, although seldom identical, to GTFT. In fact, GTFT is a $G_\chi$ strategy with $\chi \to \infty$, and it is at
an evolutionary disadvantage compared to $G_{\chi<2}$ strategies. Furthermore, whereas GTFT was proposed
ad hoc, $G_\chi$ was uncovered by identifying the evolutionarily stable subset of ZD strategies, and so it can be
understood in a principled and general way.

The success of $G_\chi$ does not change the fact that ZD strategies, when considered as a whole, are
disfavored by selection in large populations [12]. This is unsurprising, perhaps, given the breadth of
strategies comprised by the ZD subspace. It is interesting to note that, aside from GTFT, many other
classic strategies belong to the ZD set, including TFT, ALLC, ALLD and Miser [12]. Nonetheless, WSLS,
the most evolutionarily successful of the classic IPD strategies, is not itself a ZD strategy, and it has
certain advantages and deficits compared to $G_\chi$.

The discovery and elegant definition of ZD strategies remains a remarkable achievement, especially in
light of decades' worth of prior research on the Prisoner's Dilemma in both the two-player and evolutionary
settings. It is plausible, perhaps even likely, that amongst the majority of IPD strategies that are not
ZD, there are other undiscovered subsets that, like $G_\chi$, are evolutionarily important; however, without
the tractability that comes with a ZD strategy such subsets will be difficult to identify.

Appendix 

Evolutionary simulations 

We simulated a well-mixed population in which selection follows an "imitation" process [12,17]. At each
discrete time step, a pair of individuals (X, Y) is chosen at random. X switches its strategy to imitate Y
with probability $f_{x \to y}$,

\[ f_{x \to y} = \frac{1}{1 + \exp[\sigma (s_x - s_y)]}, \]

where $s_x$ and $s_y$ denote the average IPD scores of players X and Y against the entire population, and
$\sigma$ denotes the strength of selection. A mutation occurs with probability $\mu$, so that the mutated player
adopts a strategy chosen uniformly from the set of available strategies.
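
A single update of this imitation process might look like the sketch below. The data layout (a list of strategy labels plus a dictionary of pairwise payoffs), the exclusion of self-play when averaging scores, and the choice of a uniformly random individual as the mutated player are all assumptions made for illustration; the payoff values in the example are made up.

```python
import math
import random

def imitation_probability(s_x, s_y, sigma):
    """Probability that X copies Y: 1 / (1 + exp[sigma * (s_x - s_y)])."""
    z = sigma * (s_x - s_y)
    if z > 700.0:              # exp would overflow; the probability is effectively zero
        return 0.0
    return 1.0 / (1.0 + math.exp(z))

def average_score(i, population, payoff):
    """Mean payoff of individual i against all other members of the population."""
    others = population[:i] + population[i + 1:]
    return sum(payoff[(population[i], other)] for other in others) / len(others)

def imitation_step(population, payoff, sigma, mu, strategy_set, rng):
    """One time step: a possible imitation event followed by a possible mutation."""
    x, y = rng.sample(range(len(population)), 2)
    s_x = average_score(x, population, payoff)
    s_y = average_score(y, population, payoff)
    if rng.random() < imitation_probability(s_x, s_y, sigma):
        population[x] = population[y]
    if rng.random() < mu:      # assumption: the mutated player is chosen uniformly
        population[rng.randrange(len(population))] = rng.choice(strategy_set)

# Hypothetical two-strategy example with made-up pairwise payoffs.
rng = random.Random(1)
payoff = {("G", "G"): 2.0, ("G", "ALLD"): -0.2, ("ALLD", "G"): 0.6, ("ALLD", "ALLD"): 0.0}
population = ["ALLD"] * 5 + ["G"] * 5
for _ in range(200):
    imitation_step(population, payoff, sigma=1.0, mu=0.05, strategy_set=["G", "ALLD"], rng=rng)
```

Recomputing average scores at every step is quadratic in the population size; for large populations one would cache strategy counts instead.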

Evolution within the set of ZD strategies 

We define ZD strategies as those that enforce a linear relationship between the score of player Y and that
of her opponent X: $s_{yx} - \kappa = \chi (s_{xy} - \kappa)$. If two players meet and each plays a ZD strategy, then the
payoffs they receive are solutions to the equations

\[
\begin{aligned}
s_{xy} - \kappa_x &= \chi_x \, (s_{yx} - \kappa_x), \\
s_{yx} - \kappa_y &= \chi_y \, (s_{xy} - \kappa_y).
\end{aligned}
\]

Assuming $\chi > 1$ the resulting payoffs are

\[
\begin{aligned}
s_{xy} &= \frac{\kappa_x(\chi_x - 1) + \chi_x \kappa_y(\chi_y - 1)}{\chi_x \chi_y - 1}, \\
s_{yx} &= \frac{\kappa_y(\chi_y - 1) + \chi_y \kappa_x(\chi_x - 1)}{\chi_x \chi_y - 1},
\end{aligned}
\]

and the resulting payoff matrix for the two players is given in Table 1. From this, we find that strategy Y
is an evolutionarily stable strategy (ESS) if and only if $\kappa_y(\chi_x - 1) > \kappa_x(\chi_x - 1)$. In other words, whenever
$\kappa_y > \kappa_x$ then strategy X does less well against Y than Y does against itself.
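
The two linear constraints can also be solved numerically, which gives a quick check on the closed-form payoffs and on the ESS comparison. The sketch below is illustrative only; the function name `zd_vs_zd` and the example parameter values are assumptions.

```python
import numpy as np

def zd_vs_zd(kappa_x, chi_x, kappa_y, chi_y):
    """Solve s_xy - k_x = chi_x (s_yx - k_x) and s_yx - k_y = chi_y (s_xy - k_y)."""
    A = np.array([[1.0, -chi_x],
                  [-chi_y, 1.0]])
    b = np.array([kappa_x * (1.0 - chi_x),
                  kappa_y * (1.0 - chi_y)])
    s_xy, s_yx = np.linalg.solve(A, b)   # singular only when chi_x * chi_y == 1
    return float(s_xy), float(s_yx)

# Extortionate mutant X (kappa = P = 0) against a generous resident Y (kappa = R = 2).
kappa_x, chi_x, kappa_y, chi_y = 0.0, 2.0, 2.0, 1.5
s_xy, s_yx = zd_vs_zd(kappa_x, chi_x, kappa_y, chi_y)

# Closed form for comparison.
denom = chi_x * chi_y - 1.0
closed_xy = (kappa_x * (chi_x - 1) + chi_x * kappa_y * (chi_y - 1)) / denom
closed_yx = (kappa_y * (chi_y - 1) + chi_y * kappa_x * (chi_x - 1)) / denom

# Y resists invasion when the mutant earns less against Y than Y earns against itself.
resident_is_stable = s_xy < kappa_y
```

Here the mutant earns 1 against the resident, less than the resident's payoff of 2 against itself, in line with the condition $\kappa_y > \kappa_x$.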

There also exist viable ZD strategies with $\chi < 0$ provided $|\chi| \ge \max\left(\frac{B - \kappa}{C + \kappa}, \frac{C + \kappa}{B - \kappa}\right)$ and $(B - C) > \kappa > 0$.
When a ZD player Y with $\chi_y \ge 1$ meets a ZD player X with $\chi_x < 0$ they receive the payoffs

\[
\begin{aligned}
s_{xy} &= \frac{\kappa_x(|\chi_x| + 1) + |\chi_x| \kappa_y(|\chi_y| - 1)}{|\chi_x||\chi_y| + 1}, \\
s_{yx} &= \frac{|\chi_y| \kappa_x(|\chi_x| + 1) - \kappa_y(|\chi_y| - 1)}{|\chi_x||\chi_y| + 1}.
\end{aligned}
\]

Using these scores, we find that Y is an ESS provided

\[ (|\chi_x| + 1)(\kappa_y - \kappa_x) > 0. \]

Since $G_\chi$ strategies are defined by $\kappa = B - C$ and $\chi > 1$, they cannot be invaded by any other ZD
strategy, regardless of the value of $\chi$ of the invading strategy. Therefore $G_\chi$ forms an evolutionarily stable
subset of ZD strategies, with all members neutral to one another, but resistant to invasion by all strategies
outside $G_\chi$. Conversely, any strategy that does not belong to $G_\chi$ and has $\kappa < R$ is vulnerable to invasion
by any member of $G_\chi$ (see below). However, $G_\chi$ is neutral against strategies with $\chi < 0$ and $\kappa = R$.

Evolutionary stability of $G_\chi$ in a finite population

In a finite population, it is not always sufficient to look for an ESS to understand the evolutionary
dynamics [20]. However, under weak selection it is possible to find explicit criteria for a resident strategy
to resist invasion and replacement by another strategy. A resident strategy Y resists invasion by a strategy
X iff

\[ s_{xy}(N - 1) < s_{yx} + s_{yy}(N - 2). \]

Moreover, Y resists replacement by X iff

\[ s_{xx}(N - 2) + s_{xy}(2N - 1) < s_{yx}(N + 1) + 2 s_{yy}(N - 2). \]

The resident strategy Y is said to be ESS in a finite population when both of these conditions are met. We
now apply these two conditions to a resident strategy $G_\chi$, which allows us to exploit the linear relationship
between $s_{xy}$ and $s_{yx}$. Using the scores from Table 2 immediately provides the condition for $G_\chi$ to resist
initial invasion:

\[ (N - 1 - \chi)(s_{xy} - (B - C)) < 0. \]

In other words, $G_\chi$ will resist initial invasion provided the standard, infinite-population ESS condition is
met ($s_{xy} < B - C$) and also $\chi < N - 1$. Furthermore, Y will resist replacement by X provided

\[ (2N - 1 - N\chi + \chi)(s_{xy} - (B - C)) < (N - 2)((B - C) - s_{xx}). \]

Irrespective of $s_{xx}$, this replacement condition is satisfied provided both the standard ESS condition is
met ($s_{xy} < B - C$) and additionally $\chi < (2N - 1)/(N - 1)$. Therefore, in total, $G_\chi$ cannot be selectively
invaded provided $s_{xy} < (B - C)$ and $\chi < (2N - 1)/(N - 1)$ and $N > 2$.
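
The invasion criterion can be checked numerically. The sketch below evaluates the general weak-selection condition for resisting initial invasion and compares it with the specialised form for a resident $G_\chi$, obtained by substituting $s_{yx} = (B - C) + \chi (s_{xy} - (B - C))$ and $s_{yy} = B - C$. It covers only the invasion condition; the function names and test values are illustrative assumptions.

```python
def resists_invasion(s_xy, s_yx, s_yy, N):
    """General condition: resident Y resists initial invasion by mutant X."""
    return s_xy * (N - 1) < s_yx + s_yy * (N - 2)

def generous_resists_invasion(s_xy, chi, B, C, N):
    """Specialised condition for a resident G_chi: (N - 1 - chi)(s_xy - (B - C)) < 0."""
    return (N - 1 - chi) * (s_xy - (B - C)) < 0

def generous_payoff_against(s_xy, chi, B, C):
    """Payoff s_yx of the resident G_chi, from the linear relationship with kappa = B - C."""
    return (B - C) + chi * (s_xy - (B - C))

B, C = 3.0, 1.0
cases = [(1.0, 1.5, 100), (1.0, 5.0, 3), (0.2, 1.9, 10), (1.8, 30.0, 20)]
agreement = []
for s_xy, chi, N in cases:
    s_yx = generous_payoff_against(s_xy, chi, B, C)
    general = resists_invasion(s_xy, s_yx, B - C, N)
    special = generous_resists_invasion(s_xy, chi, B, C, N)
    agreement.append(general == special)
```

The second and fourth cases have $\chi > N - 1$, so the resident fails to resist invasion even though the mutant earns less than $B - C$ against it.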

We can also analyze the conditions under which $G_\chi$ is able to invade another resident strategy. Again
we exploit the fact that ZD strategies force a linear relationship between the two players' scores. Using
the payoffs from Table 2, a strategy X can be invaded by $G_\chi$ in a population with $N \gg 1$ if

\[ s_{xy} > (B - C) - [(B - C) - s_{xx}] \, \frac{1}{\chi} \]

and selection favors replacement by $G_\chi$ if

\[ s_{xy} > (B - C) - [(B - C) - s_{xx}] \, \frac{2}{2\chi - 1}. \]

There are no general constraints on $\chi$ that will guarantee these conditions for an arbitrary resident strategy
X. Nonetheless, we can make several statements in specific cases. For a resident cooperative strategy X
with $s_{xx} = B - C$ both conditions reduce to $s_{xy} > B - C$, so that $G_\chi$ can neutrally invade strategies
that cooperate with it. By contrast, a resident pure defector strategy, such as ALLD, has $s_{xx} = 0$. In
such cases, when $\chi \to 1$, $G_\chi$ can invade neutrally and selectively replace the resident defector strategy,
i.e. $G_\chi$ weakly dominates ALLD in a similar way to TFT. However, when $\chi = 2$ selection acts against
invasion by $G_\chi$, but favors replacement, meaning that $G_\chi$ will dominate ALLD if it is already present
in the population.
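
The ALLD case can be worked through numerically. In the sketch below, the long-run cooperation rate of $G_\chi$ against a pure defector is obtained by combining the linear relationship (Eq. 1 with $\kappa = B - C$) with the donation-game payoffs; this derivation, the large-$N$ margins, and all function names are illustrative assumptions rather than code from the paper.

```python
def alld_vs_generous(chi, B, C):
    """Payoffs (s_xy, s_yx) when resident ALLD (X) meets a G_chi player (Y).

    If G_chi cooperates a fraction q of the time, ALLD earns B*q and G_chi earns -C*q.
    Imposing s_yx - kappa = chi * (s_xy - kappa) with kappa = B - C fixes q.
    """
    kappa = B - C
    q = kappa * (chi - 1.0) / (C + chi * B)
    return B * q, -C * q

def invasion_margin(s_xy, s_xx, chi, kappa):
    """Positive when selection favors invasion by G_chi (large N)."""
    return s_xy - (kappa - (kappa - s_xx) / chi)

def replacement_margin(s_xy, s_xx, chi, kappa):
    """Positive when selection favors replacement by G_chi (large N)."""
    return s_xy - (kappa - 2.0 * (kappa - s_xx) / (2.0 * chi - 1.0))

B, C = 3.0, 1.0
kappa, s_xx = B - C, 0.0              # ALLD earns P = 0 against itself
results = {}
for chi in (1.0, 2.0):
    s_xy, _ = alld_vs_generous(chi, B, C)
    results[chi] = (invasion_margin(s_xy, s_xx, chi, kappa),
                    replacement_margin(s_xy, s_xx, chi, kappa))
```

With $B = 3$ and $C = 1$, the invasion margin is zero at $\chi = 1$ (neutral invasion) and negative at $\chi = 2$, while the replacement margin is positive in both cases, matching the statements above.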

Applying these results to ZD strategies X with $\chi_x < 0$, we find that a $G_\chi$ strategy Y is selected to
replace X if

\[ (|\chi_x| + 3 - 2|\chi_y|)\,((B - C) - \kappa_x) > 0 \]

when $N \gg 1$. This is satisfied when

\[ |\chi_y| < \frac{|\chi_x| + 3}{2}. \]

The smallest possible value for $|\chi_x|$ is $|\chi_x| = 1$, so that $G_\chi$ is selected to replace any X provided $|\chi_y| < 2$.
This is the same condition as for the evolutionary success of $G_\chi$ generally, and explains why $G_\chi$ dominates
in the set of all ZD strategies (Fig. S1).

Long-memory strategies 

Our results for the evolutionary success of $G_\chi$ in a finite population also hold against longer-memory
strategies. As per Press & Dyson, from the perspective of a memory-1 player, a long-memory opponent
is equivalent to a memory-1 opponent. Thus the payoff $s_{xy}$ can be determined by considering only the set
of memory-1 strategies. However, the payoff a long-memory player receives against itself, $s_{xx}$, cannot be
understood by considering only the set of memory-1 strategies (because both players have long memories).
Nonetheless, under the standard IPD assumption, $2R > T + S$, the highest total payoff for any pair of
players in the IPD is $2R$ and, for a pair of identical players (with the same genotype), this means that
$s_{xx} \le R$. Thus all our results on the invasibility of $G_\chi$ continue to hold, even against long-memory
resident strategies.
Supplementary figures

Figure S1 - Generous ZD ($G_\chi$) strategies tend to dominate all other ZD strategies in an evolving population,
including against $\chi < 0$ strategies. Top - Populations initialised with a random ZD strategy (left) or with $\kappa = P$
and $\chi < 0$ (right) nonetheless evolve towards the maximum $\kappa = R$ associated with $G_\chi$. Middle - The variance
in $\kappa$ shrinks to approximately 0, indicating that all evolutionary paths converge to $\kappa \approx R$. Bottom - All populations evolve
strategies with the positive $\chi$ associated with the $G_\chi$ strategies. Parameters are $N = 100$, $B = 3$, $C = 1$ and
selection strength $\sigma = 100$. Mutant strategies were drawn uniformly at random from the space of all ZD strategies
and replaced the resident strategy with the standard fixation probability.

Figure S2 - Evolution of $\chi$ within the space of ZD strategies with $\chi \ge 1$. Populations were simulated in the regime
of weak mutation. The figure shows the ensemble mean value of $\chi$ in the population, plotted over time, along with
a sample path for a single evolution. Each population was initialised at an extortion strategy $E_\chi$, with $\chi$ drawn
uniformly from the range $\chi \in [1, 10]$. Given a resident strategy in the population, mutations to $\{\kappa, \chi, \phi\}$ were
proposed as normal deviates of the resident strategy, truncated to constrain $\kappa \in [P, R]$, $\chi > 1$, and $\phi$ within the
feasible range given $\kappa$ and $\chi$. A proposed mutant strategy replaces the resident strategy with a fixation probability
calculated using their respective payoffs. The mean $\chi$ among 10^ replicate populations is plotted as a function of
time: it evolves neutrally. Parameters are $B = 3$, $C = 1$, $N = 100$, and selection strength $\sigma = 1$.



[Figure S3 plot. Panel title: "Probability of being invaded by another strategy". x-axis: "Probability of invasion ($\log_{10}[p_{res} N]$)".]

Figure S3 - $G_\chi$ can be selectively invaded by other IPD strategies for $\chi > 2$. We computed the probability $p_{res}$ that
a resident $G_\chi$ or WSLS strategy will be displaced by a random mutant strategy drawn uniformly from the space
of memory-1 strategies. Neither strategy is ever invaded with a probability exceeding the neutral replacement
probability, $1/N$, provided $\chi \le 2$. However, strategies with larger values of $\chi$ (gray line) can frequently be invaded.
We drew 10^ strategies uniformly from the space of all memory-1 strategies, $p = \{p_R, p_S, p_T, p_P\}$, and computed
the fixation probabilities in a population of size $N = 100$. The x-axis shows $\log_{10}[N p]$, so that 0 corresponds to the
neutral fixation probability, negative values correspond to selection against fixation, and positive values to selection
for fixation. Parameters are $B = 3$, $C = 1$, $\chi = 1.01$, and selection strength $\sigma = 1$.